Methods in Ecology and Evolution — Latest Matching Preprints

1

Ecological connectivity modelling with WebAssembly

Southgate, A. J.; Redihough, J.

2026-07-09 ecology 10.64898/2026.07.08.737333 medRxiv

Top 0.1%

49.3%

Circuit theory has been successfully applied to ecological connectivity modelling, notably via the Circuitscape software, which is typically run locally on a laptop or via a server. For downstream geospatial web applications relying on connectivity analysis, backend infrastructure is required, which can be costly and require advanced data governance. Recent developments in WebAssembly now allow fast C++ or Rust code to be run directly in a sandboxed browser environment for edge computing. We present a WebAssembly/Rust toolset with a geospatial data pipeline and efficient edge-computing implementation of connectivity analysis. This approach may be useful for geospatial modelling software where rasters and memory footprint are small enough for the browser context. Our results show that as expected, Circuitscape solves 1000x1000 raster networks 1-2x faster, but requires further file writes. Accounting for total program runtime, our web implementation can be faster for the given context.

2

Segmentation and profile-based classification of movement strategies from animal tracking data

Kadlec, I.; Bartak, V.; Selimovic, A.; Kutal, M.; Dula, M.; Stier, N.; Meissner-Hylanova, V.; Peskova, L. B.; Sladecek, M.; Vorel, A.; Signer, J.

2026-05-14 ecology 10.64898/2026.05.13.724011 medRxiv

Top 0.1%

46.7%

Classifying animal movement strategies from GPS tracking data is essential for understanding space use, population dynamics and conservation planning. However, existing approaches either require strong parametric assumptions about trajectory shape, large labelled datasets (i.e. expert-annotated) for machine learning, or lack formal uncertainty quantification. These limitations create barriers for researchers working with novel species or limited sample sizes.
We present a profile-based classification framework consisting of three steps. First, trajectories are segmented using breakpoint detection applied to Net Squared Displacement (NSD) time series. Movement metrics are then extracted from each segment and classified by comparing them to empirically derived behavioural profiles via Z-score distances transformed to softmax probabilities. Bootstrap resampling quantifies uncertainty in the resulting classifications from both training and test data. We validated the framework through simulation experiments and applied it to GPS tracking data from two ecologically contrasting species: gray wolf (Canis lupus; 43 individuals) and northern lapwing (Vanellus vanellus; 15 individuals).
Simulations showed that 5-10 training segments per movement strategy suffice for reliable classification, with overall accuracy of 91.1% across residential, floating and dispersal strategies. Segment duration of 30-60 days was required for confident discrimination of residential and floating behaviour. For wolves, the framework clearly distinguished residency, floating or dispersal (91.2% of segments classified with >50% probability). For lapwings, migration was identified with high confidence, while residential-floating discrimination reflected genuine ecological ambiguity confirmed by domain experts, with bootstrap confidence intervals transparently flagging uncertain cases.
The profile-based framework provides an accessible, interpretable alternative to parametric NSD fitting and machine learning approaches, requiring modest training data while delivering probabilistic classifications with honest uncertainty estimates. An R package (moveprofile) implementing the complete workflow is freely available. The framework is applicable to any tracked species where distinct movement strategies can be identified by expert knowledge.
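The classification step described in this abstract can be illustrated with a minimal Python sketch. It is not the moveprofile implementation: the metric names, the profile values, the use of a mean absolute Z-score as the distance, and the softmax over negative distances are all assumptions made for illustration.

```python
import math

def net_squared_displacement(track):
    """NSD of every fix relative to the first fix of the track."""
    x0, y0 = track[0]
    return [(x - x0) ** 2 + (y - y0) ** 2 for x, y in track]

def classify_segment(metrics, profiles):
    """Softmax probabilities per strategy from mean absolute Z-score distances.

    profiles maps strategy -> {metric: (mean, sd)}; all names are hypothetical.
    """
    distances = {}
    for name, profile in profiles.items():
        z = [abs(metrics[k] - mu) / sd for k, (mu, sd) in profile.items()]
        distances[name] = sum(z) / len(z)
    # Softmax over negative distances: the closest profile gets the highest weight.
    # Subtracting the minimum distance keeps the exponentials numerically stable.
    d_min = min(distances.values())
    weights = {n: math.exp(-(d - d_min)) for n, d in distances.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Hypothetical behavioural profiles (mean, sd) and one hypothetical segment.
profiles = {
    "residential": {"max_nsd_km2": (50.0, 20.0), "mean_step_km": (1.0, 0.5)},
    "dispersal": {"max_nsd_km2": (5000.0, 1500.0), "mean_step_km": (6.0, 2.0)},
}
segment = {"max_nsd_km2": 4200.0, "mean_step_km": 5.0}
probabilities = classify_segment(segment, profiles)
print(max(probabilities, key=probabilities.get))
```

Repeating `classify_segment` on bootstrap resamples of the training segments would give an interval around each probability, which is the role bootstrap resampling plays in the abstract.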

3

move2utils: a utility toolkit for the move2 ecosystem

Kranstauber, B.; Safi, K.; Scharf, A. K.

2026-07-10 ecology 10.64898/2026.07.07.736908 medRxiv

Top 0.1%

37.7%

Studying animal movement at the population scale requires a stable, modern software substrate. Within R, the legacy move package supplied that substrate for over a decade, but its sp/rgeos backbone has been retired. The successor package move2 deliberately confined its scope to the data class and core movebank API functions.
The analytical machinery of move, namely dynamic Brownian-bridge utilisation distributions, the directional bivariate-Gaussian variant, corridor segmentation, and along-track thinning, was left to port to the modern sf/terra stack.
We present move2utils, an R package that completes and complements that transition. move2utils provides move2-native ports of the move analytical functions, preserves the original C kernels where they exist, and replaces the deprecated spatial scaffolding around them. It additionally ports some of the legacy R-based code to faster C kernels to improve computational speed. move2utils also exposes novel outlier-detection methodology described in detail in a companion paper.
The package is open-source (GPL ≤ 3), is developed on the MPCDF GitLab and mirrored on GitHub for public installation, and ships with vignettes and a CI-tested check suite. We illustrate it with a worked example on real tracking data and synthetic datasets.

4

Estimation, testing, and inference of network heterogeneity

Ma, Z.; Ellison, A. M.

2026-05-11 ecology 10.64898/2025.12.18.695221 medRxiv

Top 0.1%

32.8%

Diversity and heterogeneity are related but distinct and often conflated concepts. Diversity quantifies the number or relative abundance of discrete objects (e.g. species), whereas heterogeneity includes interactions among them (i.e. in networks) and between them and their environments. Although estimation, testing, and inference of diversity are well established and understood in ecology, comparable methods for heterogeneity are themselves diverse and rarely applied consistently or coherently.
We propose a consistent and coherent methodology for estimation, testing, and inference of heterogeneity of ecological networks. Estimation of heterogeneity is scalable from individuals to populations using the variance-to-mean (V/M) ratio and extensions of Taylor's power law (TPL) to the analysis of networks. Bootstrapping is used to partition heterogeneous and random clusters, whereas permutation tests are used to compare individual- and network-level heterogeneity. Inference includes the identification of "important" (e.g. dominant, foundation, keystone) species and "rich clubs" in heterogeneous networks, detection of biomarkers, and analysis of heterogeneity-stability relationships.
We demonstrate this methodology using the global Earth Microbiome Project dataset. The method could reliably distinguish heterogeneous nodes and networks; identified significant differences in heterogeneity among microbial assemblages in different habitats and in specific sites within habitats; and supported established principles of host filtering, species sorting, and niche partitioning.
Our methods for estimation, testing, and inference of heterogeneity are modular, scalable, and applicable to a wide range of ecological systems. They also provide a quantitative method for understanding how evolutionary and ecological forces jointly shape both topology and heterogeneity in ecological networks.
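As a rough illustration of the two estimators named in this abstract, the sketch below computes a variance-to-mean ratio and fits Taylor's power law, V = a * M^b, by ordinary least squares on log-transformed values. The function names and the toy counts are assumptions; the authors' network extensions are not reproduced here.

```python
import math
from statistics import mean, variance

def variance_to_mean(counts):
    """V/M ratio: about 1 for Poisson-like counts, above 1 when aggregated."""
    return variance(counts) / mean(counts)

def fit_taylor_power_law(samples):
    """Least-squares fit of log V = log a + b * log M; returns (a, b)."""
    xs, ys = [], []
    for counts in samples:
        m, v = mean(counts), variance(counts)
        if m > 0 and v > 0:  # logarithms need strictly positive values
            xs.append(math.log(m))
            ys.append(math.log(v))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b

# Toy abundance counts (hypothetical): each row is one taxon across three samples.
samples = [[1, 2, 3], [2, 4, 6], [4, 8, 12]]
a, b = fit_taylor_power_law(samples)
print(round(a, 3), round(b, 3))
```

In the toy data each row is a scaled copy of the first, so the variance grows with the square of the mean and the fitted exponent b is 2.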

5

Data requirements for accurate extinction-risk prediction in bistable populations

Rajakumar, A.; Buenzli, P. R.; Simpson, M. J.

2026-06-22 ecology 10.64898/2026.06.19.733461 medRxiv

Top 0.1%

31.9%

Understanding and predicting extinction risk is a central challenge in population biology. Mathematical models incorporating Allee thresholds are commonly used to understand population dynamics and to assess extinction risks. Inaccurate predictions can have serious consequences for conservation management. In this simulation study, we develop a likelihood-based inference and prediction workflow to estimate parameters, including the Allee threshold and population diffusivity parameters, using noisy count data generated using a well-defined discrete model. Although parameters are identifiable according to commonly used criteria, the accuracy of resulting predictions depends strongly on the quantity, quality, collection time and spatial resolution of the data. Our workflow demonstrates that seemingly reliable parameter estimates can lead to inaccurate predictions, highlighting the need for careful consideration of data quality and quantity to guide extinction-risk modelling and prediction. Open source software is provided on GitHub to replicate and extend all results considered.

6

Automated Parameter Estimation for Camera Trap Density Models Using Computer Vision-Enhanced Distance Sampling

McMurry, S.; Alyetama, M.; Goldstein, B.; Kays, R.

2026-06-16 ecology 10.64898/2026.06.14.732225 medRxiv

Top 0.1%

30.6%

Models for estimating animal density from camera traps require four parameters informing detection: movement speed, daily activity level, staying time (duration animals remain within the detection zone), and effective detection distance. These parameters traditionally come from labor-intensive manual measurements and auxiliary telemetry. Recent advances in computer vision can provide the positions of animals in camera trap images, which have been used for distance sampling. We extend this approach to extract all four parameters from imagery, providing the first AI-derived estimates of movement speed and staying time from automated coordinate tracking. We also introduce a new joint multi-species hierarchical distance function that estimates deployment-specific effective detection distances while borrowing strength across species through partial pooling. Our pipeline integrates MegaDetector for animal detection, the Segment Anything Model for segmentation, and Dense Prediction Transformers for monocular depth estimation. From frame-level coordinates, we reconstruct movement trajectories across burst sequences to estimate speed with size-biased distribution corrections, calculate staying time through bounding box interpolation, and estimate activity levels from detection timestamps. The joint hierarchical distance function decomposes the detection scale parameter into a shared deployment-level effect and species-specific offsets, so species effects represent deviations from the multi-species average, allowing data-rich species to inform detection conditions where rare species have few observations. AI-derived scene depth enters the model as a covariate on detection range, providing a vegetation openness metric from the same pipeline. To address position errors from depth estimation, we apply data quality filters. 
We processed 122,574 frames from 181 deployments across montane forests in Washington and Montana, generating parameter estimates for 12 species without manual annotation. Automated speed estimates produced day ranges 2.7 to 4.3 times GPS telemetry-derived daily distances, reflecting differences between encounter velocity within detection zones and landscape-scale displacement. Deployment-level variation in detectability exceeded species-level differences 3:1, with scene depth strongly predicting detection range; mean effective detection distances ranged from 4.1 to 7.6 m. Applied to a Random Encounter Model, these parameters yielded a white-tailed deer density estimate of 21.4 animals/km² and the Random Encounter Staying Time model yielded 11.6 animals/km² in Montana. This pipeline enables scalable density estimation across large camera trap networks.
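For readers unfamiliar with the Random Encounter Model, the sketch below shows the standard published form of the estimator, D = (y / t) * pi / (v * r * (2 + theta)), where y / t is the trap rate, v the day range, r the detection radius and theta the detection angle in radians. The input values are purely illustrative and are not taken from the study; how the authors combine speed and activity level into a day range is not shown.

```python
import math

def rem_density(detections, camera_days, day_range_km, radius_km, angle_rad):
    """Random Encounter Model density in animals per square kilometre."""
    trap_rate = detections / camera_days  # detections per camera-day
    return trap_rate * math.pi / (day_range_km * radius_km * (2.0 + angle_rad))

# Illustrative inputs only; none of these numbers come from the abstract.
density = rem_density(
    detections=120,
    camera_days=1000,
    day_range_km=3.0,
    radius_km=0.006,
    angle_rad=0.7,
)
print(f"{density:.1f} animals/km^2")
```

Because the radius and the day range both sit in the denominator, small errors in effective detection distance or speed translate directly into proportional errors in density.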

7

Towards a general Detector of terrestrial Arthropods in Natural backgrounds

Remy, E.; Carlier, A.; Massol, E.; Kacimi, R.; Chaine, A. S.; Cauchoix, M.

2026-05-08 ecology 10.64898/2026.05.06.723207 medRxiv

Top 0.1%

30.3%

Widespread arthropod declines pose risks to ecosystem functioning and agriculture. Assessing this decline or potential remediation implies the need for standardized and scalable population monitoring. Image-based methods, including camera traps and citizen science programs, are increasingly used, but the volume of data collected requires automated analysis. Robust arthropod detection is essential for individual counting or fine-grained classification, yet current datasets and algorithms do not address the vast morphological diversity across arthropod species and often overlook the variety of photographic contexts, such as differences in background, lighting, and image composition, in which arthropods are captured. To address this gap, we developed an arthropod detection dataset, covering all terrestrial families present in France with available validated images on the iNaturalist platform (749 families). To achieve this, we employed an iterative workflow in which a YOLOv11 model pre-annotated images, using one representative species per family, followed by manual correction and model retraining. Repeating this process progressively reduced annotation effort and improved model accuracy. The final outcome consists of a publicly available curated detection dataset and a robust arthropod detector for natural background scenes. The detector achieves an F1-score of 0.91, demonstrating strong performance despite substantial interspecific morphological variation and heterogeneity in photographic contexts. We further demonstrated the taxonomical universality of the model, showing high F1-score and IoU averaged at the class (0.79, 0.85) and order level (0.82, 0.86), and also good detection generalizability (F1-score > 0.90, IoU > 0.83) on species, genera and families never encountered by the model during training.
Finally, we show how this model can be improved to generalize to new datasets using data augmentation, complementary training data or fine-tuning, and to increase detection of small objects. In particular, we report performance of the improved models on three use cases widely used in non-lethal insect monitoring: (i) diurnal pollinator monitoring through citizen science or (ii) flower and nocturnal insect monitoring through smartphone time-lapse of a UV-illuminated white panel. These results mark an important step toward automated analysis of arthropod images in natural contexts, from both large-scale automated monitoring approaches and citizen science monitoring programs.
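The two metrics reported in this abstract, IoU and F1-score, have standard definitions that can be written in a few lines. The sketch below assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) and detection counts already matched to ground truth; the matching rule used by the authors is not stated in the abstract.

```python
def box_area(box):
    """Area of an axis-aligned box (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_w * overlap_h
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0

def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, written in count form."""
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(f1_score(true_pos=8, false_pos=1, false_neg=1))
```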

8

NicheDiv: A DAPC framework to quantify niche divergence across highly multivariate environmental space

Schoenberger, D.; MacDonald, Z. G.; Schmidt, B. C.; Dupuis, J. R.

2026-06-22 ecology 10.64898/2026.06.19.733388 medRxiv

Top 0.1%

30.1%

Quantifying niche divergence is crucial to understanding the ecological and evolutionary processes underlying range limits, coexistence, speciation, biogeography, and macroevolution. Yet available approaches rely on low-dimensional climate summaries, are vulnerable to multiple biases, or struggle with high-dimensional collinear data. We introduce the R package NicheDiv, which adapts discriminant analysis of principal components (DAPC) to quantify pairwise niche divergence across any number of abiotic and biotic environmental variables associated with occurrence records. Our method first addresses correlations among environmental variables through principal component analysis. It then identifies a single discriminant axis that maximizes separation between predefined groups (species/lineages/populations), summarizing multivariate niche structure into one dimension. Significance is assessed by a permutation test that reshuffles group identities to mimic a shared niche. To characterize ecological differentiation, NicheDiv calculates Schoener's D as an overlap index and extends the niche divergence plane to multivariate space, providing metrics such as niche dissimilarity and exclusivity. Extracted variable contributions from the discriminant axis identify environmental variables that contribute most to divergence. Using simulations and empirical data together with a large set of environmental layers, we demonstrate that NicheDiv is computationally scalable, detects subtle divergence in high-dimensional space despite multicollinearity, distinguishes different forms of niche divergence (weighted, nested, soft, hard), and identifies the variables that potentially drive divergence. Compared with alternative divergence tests (PCA-env, hypervolumes, MVNH, PERMANOVA, PCA-space, and logistic regression), NicheDiv generally retains more variation, scales more consistently with increasing divergence, and returns more interpretable effect sizes.
NicheDiv automatically extracts such environmental data from preconfigured and user-supplied GIS layers and implements a preprocessing pipeline that reduces known biases: delimiting accessible background space, spatially thinning occurrences, balancing sample sizes, filtering low-information variables, and screening predictors for between-group environmental analogy. We test our framework with empirical analyses of Hemileuca buck moths and demonstrate that their niches are structured by a range of seasonal abiotic and biotic variables rather than annual climatic averages. Overall, NicheDiv offers a robust framework for characterizing niche divergence across multiple environmental axes in support of species delimitation, local adaptation, community ecology, biogeography, and macroevolution.
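The overlap index and permutation test mentioned in this abstract can be sketched on a single axis. The code below is not NicheDiv: it assumes discriminant-axis scores are already available for two groups, bins them on a shared grid, computes Schoener's D = 1 - 0.5 * sum(|p_i - q_i|), and reshuffles group labels to approximate a shared-niche null. The bin count, permutation count and toy scores are assumptions.

```python
import random

def schoener_d(p, q):
    """Schoener's D between two discrete distributions over the same bins."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def binned_proportions(values, lo, hi, bins):
    """Share of values per equal-width bin on [lo, hi] (requires hi > lo)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]

def overlap_permutation_test(group_a, group_b, bins=10, n_perm=999, seed=0):
    """Observed D and the share of label shuffles with D at or below it."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    lo, hi = min(pooled), max(pooled)

    def overlap(a, b):
        return schoener_d(binned_proportions(a, lo, hi, bins),
                          binned_proportions(b, lo, hi, bins))

    observed = overlap(group_a, group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group identities at random
        if overlap(pooled[:len(group_a)], pooled[len(group_a):]) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy scores on a hypothetical discriminant axis for two well-separated groups.
scores_a = [0.1 * i for i in range(20)]
scores_b = [5.0 + 0.1 * i for i in range(20)]
observed_d, p_value = overlap_permutation_test(scores_a, scores_b)
print(observed_d, p_value)
```

A low observed D together with a small p-value indicates less overlap than expected if both groups shared one niche.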

9

Monkey hear, monkey do what? An application of Automated Behavioural Response systems for hypothesis testing in the world's smallest monkey

Barker, L.; Papworth, S. K.

2026-06-01 animal behavior and cognition 10.64898/2026.05.28.728389 medRxiv

Top 0.1%

28.7%

Observer effects are a frequent problem in animal behaviour studies, particularly when assessing responses to human disturbance. Automated Behavioural Response (ABR) systems, which combine camera traps with automated sound playbacks, offer a solution but have been primarily used on large terrestrial mammals. Here, we demonstrate their use in a small (~110 g) arboreal primate, the eastern pygmy marmoset (Cebuella niveiventris). We conducted two playback experiments to test the risk-disturbance and distracted prey hypotheses. The marmosets exhibited strong anti-predator responses to avian predator calls, including increased fleeing and vocalisations. Human speech elicited similar but weaker responses, indicating that pygmy marmosets do not perceive raptors and humans as equivalent threats. Embedding predator calls into anthropogenic noise reduced vocal responses, suggesting that anthropogenic noise interferes with responses to predation cues. Across five weeks, we generated 128 successful experimental trials, demonstrating that ABRs can rapidly produce sample sizes sufficient for hypothesis testing in the field.

10

An Open Reproducible Framework for CNN-Based Cetacean Vocalization Detection in Passive Acoustic Monitoring

De Marco, R.

2026-05-06 animal behavior and cognition 10.64898/2026.05.01.721665 medRxiv

Top 0.1%

26.7%

This paper presents a six-stage methodological framework for Convolutional Neural Network (CNN)-based cetacean vocalization detection and classification in Passive Acoustic Monitoring (PAM), implemented as the open-source toolkit ai-pam-pipeline. The framework is generalizable across species and fully parameterised through a single configuration file, guaranteeing exact experimental reproducibility. Two experiments are reported. Experiment A examines the effect of FFT window length Nfft ∈ {256, 512, 1024} on binary bottlenose dolphin (Tursiops truncatus) whistle detection using stratified 10-fold cross-validation on an in-domain dataset (Oltremare, 192 kHz) and a cross-domain benchmark (DCLDE 2022). In-domain performance is uniformly high (macro F1 ≈ 0.98; Wilcoxon, all p > 0.05). Cross-domain results diverge substantially: Nfft = 256 is significantly superior (p = 0.006, rank-biserial r = 0.89). The mechanism is an upsampling amplification effect: coarser spectral bins produce wider, higher-contrast FM traces after bilinear resampling to fixed image dimensions. This superiority is threshold-invariant: precision equals 1.000 across all configurations and thresholds θ ∈ [0.1, 0.9], confirming that the advantage is not an artifact of threshold choice. These findings demonstrate that preprocessing choices, often treated as secondary implementation details, can significantly affect cross-domain generalisation. While Nfft serves here as a controlled case study, the framework is designed to enable systematic, reproducible evaluation of arbitrary preprocessing parameters within a unified experimental protocol. Experiment B demonstrates multiclass capability on five T. truncatus vocalization categories (macro F1 = 0.843); inter-class confusion between click trains and burst-pulse sounds reflects biological signal overlap rather than classifier failure.
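The trade-off behind the FFT window length in Experiment A follows from basic spectrogram arithmetic: the frequency bin width is the sample rate divided by Nfft, and the window duration is Nfft divided by the sample rate. The short sketch below tabulates both for the three window lengths at the 192 kHz in-domain sample rate; the function name is hypothetical and the toolkit's own configuration keys are not shown.

```python
def stft_resolution(sample_rate_hz, n_fft):
    """Frequency and time resolution implied by an FFT window length."""
    return {
        "bin_width_hz": sample_rate_hz / n_fft,        # spacing between frequency bins
        "window_ms": 1000.0 * n_fft / sample_rate_hz,  # duration of one analysis window
        "n_bins": n_fft // 2 + 1,                      # one-sided spectrum size
    }

for n_fft in (256, 512, 1024):
    print(n_fft, stft_resolution(192_000, n_fft))
```

At 192 kHz, Nfft = 256 gives 750 Hz bins over 129 rows, while Nfft = 1024 gives 187.5 Hz bins over 513 rows, so the shortest window yields the coarsest spectral grid before any resampling to a fixed image size.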

11

EcoMorph: Universal morphological trait quantification from natural language prompts for ecological research

Amoah, E. I.; Bunch, Z.; Thomas, H. M.; Patch, H. M.; Grozinger, C.

2026-07-12 bioinformatics 10.64898/2026.07.10.737871 medRxiv

Top 0.1%

26.5%

Morphological traits such as floral area and body size are fundamental to ecological research, serving as inputs for studies of pollinator-plant interactions, habitat quality, and biodiversity monitoring. However, accurately measuring these traits from images remains challenging, particularly in complex field conditions where existing tools exhibit reduced accuracy and limited generalizability across taxa.
We present EcoMorph, a modular morphological measurement system that leverages the Segment Anything Model 3 (SAM3) to quantify traits across diverse ecological contexts. Unlike task-specific segmentation models requiring domain-specific training data, SAM3's prompt-based architecture enables segmentation of arbitrary biological structures from natural-language prompts, using the same underlying model across flowers, insects, and other targets without retraining. From the resulting segmentations, EcoMorph extracts three classes of measurement: area, linear dimensions, and object counts.
We validated EcoMorph across two ecological scales. At the intermediate scale, EcoMorph-derived floral area agreed closely with manual ImageJ measurements (R² = 0.935, n = 74) under simple-background conditions and (R² = 0.928, n = 58) under complex-background conditions, with valid predictions for 95% of images. At the fine scale, EcoMorph-derived insect body area was strongly correlated with hand-measured intertegular distance (r = 0.810, n = 349), capturing body-size variation across species from the small Bombus impatiens to the large Xylocopa virginica. Object counts matched manual counts almost exactly for well-separated insects in an insect box (R² = 0.9997, n = 12).
By combining prompt-based segmentation with modular measurement, EcoMorph enables high-throughput quantification of area, size, and abundance from heterogeneous image sources without taxon-specific training. This generality supports a broad range of ecological applications, including pollinator and plant trait research, biodiversity and abundance monitoring, and allometric biomass estimation.

12

InsectDCT: A generalized pipeline for detection, taxonomic classification, and tracking of insects in camera-trap recordings

Bjerge, K.; Wogram, S. F. A.; Serra-Marin, P. E.; Sakhiashvili, O.; Hoye, T. T.

2026-07-10 ecology 10.64898/2026.07.07.736939 medRxiv

Top 0.1%

22.6%

Automated monitoring of insect pollinators in natural environments with insect camera traps and trained deep learning algorithms provides novel data for insect ecological studies. However, efficient and accurate image recognition analysis of the recorded images or videos is challenging, particularly for images containing small insects against complex backgrounds with diverse vegetation communities. Even when insects can be detected in images, identifying their taxonomy remains difficult, particularly in footage with low image resolution or with varying light conditions and distances from the plants, and in cases where insects appear blurry or only partially visible. In this work, we present InsectDCT, an AI-based pipeline for automated detection, hierarchical classification, and tracking of insects in footage of natural vegetation tested in different environments. The InsectDCT pipeline consists of three stages: insect Detection and localization, hierarchical taxonomic Classification, and spatio-temporal Tracking. In the first stage, insects are detected in time-lapse images or video recordings using the You Only Look Once (YOLO11) object detection architecture. Detection performance is improved using motion-enhanced images, which improve robustness in cluttered and three-dimensional environments. The detector is trained on an extensive dataset that contains more than 60,000 images collected using camera traps deployed across a wide range of plant families and floral habitats. In the second stage, detected insects are classified using a hierarchical taxonomy-aware classification framework that covers 80 taxonomic groups. Classification is performed at multiple taxonomic levels, including order, family, and genus/species, allowing coarse and fine-grained ecological analyses while accounting for varying levels of visual ambiguity.
In the third stage, a multi-object tracking module is applied to high temporal-resolution image sequences and video data to associate detections of the same individual across time. InsectDCT code and all datasets are made publicly available.
Author summary: Insects are declining worldwide, creating an urgent need for efficient methods to monitor their abundance, activity, and diversity. Traditional insect surveys often require extensive fieldwork and expert taxonomic identification, which limits the scale and frequency of monitoring. In this study, we developed InsectDCT, an artificial intelligence-based pipeline that automatically detects, classifies, and tracks insects in camera-trap recordings collected from natural and semi-natural environments. Our approach combines deep-learning methods for object detection, hierarchical taxonomic classification, and tracking of individual insect observations through time. Unlike many existing systems that are trained for a single habitat or plant species, we designed our framework using images collected across a wide range of flowering plants, camera systems, and insect groups. This makes the system more transferable to new ecological settings. The classifier can identify insects at multiple taxonomic levels and can return higher-level classifications when species-level identification is uncertain. We demonstrate that the pipeline can process large image datasets efficiently, including on low-power edge-computing devices such as Raspberry Pi systems. By providing both the software and the underlying datasets, we aim to support scalable, non-invasive insect monitoring and facilitate future ecological and conservation research.

13

eCOMET: An R package for evaluating metabolic diversity and enrichment from LC-MS/MS data to test ecological hypotheses from individuals to ecosystems

Choi, M.-S.; Forrister, D. L.; Dury, G. J.; Kang, K. B.; Sedio, B. E.; Joo, Y.

2026-06-05 ecology 10.64898/2026.06.02.729701 medRxiv

Top 0.1%

22.3%

Methods in metabolomics have grown exponentially in recent years, providing new insight into the ecological function and evolutionary impact of diverse plant metabolites. Metabolomics requires a command of numerous tools, the outputs of which are typically integrated through in-house, custom code that presents a workflow bottleneck and a barrier to entry for researchers in ecology, evolution, and behavior who may benefit from adding a metabolomics perspective to their research.
We introduce eCOMET, an R package for integrating and harmonizing the outputs of common metabolomics bioinformatics tools and conducting statistical analyses and data visualization methods useful for ecological metabolomics.
Our package combines metabolome feature metadata with quantification tables (e.g., mzmine), feature dissimilarity matrices (e.g., modified cosine and DreaMS), and feature annotations (e.g., SIRIUS) into a cohesive R data object to facilitate downstream analyses, including the calculation of diversity and disparity metrics and differential accumulation analysis.
Our goal is to make metabolomics accessible to a wider range of researchers in ecology, evolution, and behavior to unlock the potential of ecological metabolomics to generate novel insight in these fields.

14

Linking automated image analysis to ecological inference: high-throughput monitoring of soil fauna

Hendrikx, H.; Belaud, E.; Postic, F.; Scalabrino, M.; Lebeau, M.; Le Maire, G.; Jourdan, C.; Gallet, P.; Hedde, M.

2026-06-16 ecology 10.64898/2026.06.16.732537 medRxiv

Top 0.1%

22.3%

1 - Automated in situ sensors (e.g., buried scanners) are transforming biodiversity monitoring by generating data at spatio-temporal resolutions unattainable through traditional sampling, including in cryptic environments such as soil that have remained largely inaccessible to existing methods. However, extracting ecologically meaningful information from these data streams requires substantial image processing effort that currently constitutes a critical bottleneck, particularly when the signal-to-noise ratio is low and annotated training data are scarce.
2 - Standard end-to-end deep learning detection pipelines offer unsatisfactory results due to the lack of training data and heterogeneity of the taxa of interest. We explore the potential of combining traditional computer vision algorithms with state-of-the-art deep learning models to build efficient raw data processing pipelines from limited annotation effort. Specifically, based on the observation that the background barely changes, we focus on the differences between two consecutive images to turn the initial detection problem (with very low signal) into a simpler classification problem, which we solve by fine-tuning foundation models on limited annotated data.
3 - Our approach significantly reduces the annotation effort, allowing us to release an open dataset with about 600 soil scans and more than 8 000 labeled invertebrate occurrences across nine taxa. Using this dataset to train our models, we obtained population count estimates with relative errors ranging from 10% to 61% across taxa over a three-month period. Ecological validation through a land-use stability analysis showed full directional congruence between automated and expert-annotated classifications across all nine taxa examined, with effect-size discrepancies proportional to per-taxon classification accuracy.
4 - These results demonstrate that combining domain-specific heuristics with fine-tuned foundation models provides an effective and data-efficient strategy for automating ecological image processing workflows in low-signal, data-scarce contexts. The validated pipeline removes the manual annotation bottleneck that has historically limited scanner-based soil monitoring to short observational windows and restricted taxonomic scope, opening the way for continuous, large-scale tracking of soil invertebrate community dynamics at resolutions previously unachievable.
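The idea of differencing two consecutive scans to turn detection into classification can be sketched without any imaging library. The code below is an assumption-laden illustration, not the authors' pipeline: images are plain nested lists of grey levels, the threshold and minimum region size are arbitrary, and the classifier that would label each cropped region is omitted.

```python
def changed_regions(previous, current, threshold=30, min_pixels=2):
    """Bounding boxes (row_min, col_min, row_max, col_max) of changed pixel clusters."""
    rows, cols = len(current), len(current[0])
    # Pixels whose grey level changed by more than the threshold between scans.
    mask = [[abs(current[r][c] - previous[r][c]) > threshold for c in range(cols)]
            for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill over 4-connected neighbours to collect one region.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(pixels) >= min_pixels:  # drop isolated noisy pixels
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes

# Toy 5x5 scans: a 2x2 patch appears, plus one isolated noisy pixel.
scan_t0 = [[0] * 5 for _ in range(5)]
scan_t1 = [row[:] for row in scan_t0]
for y, x in ((1, 1), (1, 2), (2, 1), (2, 2), (4, 4)):
    scan_t1[y][x] = 200
print(changed_regions(scan_t0, scan_t1))
```

Each returned box would then be cropped and passed to a fine-tuned classifier, which is the simpler problem the abstract refers to.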

15

WolfPackR: An R package for identifying wolf packs based on genetic and spatial data

Boncourt, E.

2026-04-29 ecology 10.64898/2026.04.28.721440 medRxiv

Top 0.1%

21.8%

The global expansion of grey wolf (Canis lupus) populations, particularly in Europe, underscores the need for robust tools to study their social structure, territory use, and genetic relatedness. Wolf packs are dynamic, evolving through dispersal, mortality, and reproductive success, and their accurate identification is crucial for effective conservation and conflict mitigation. Traditional methods for estimating wolf populations and pack structures--such as snow tracking or howling surveys--are labor-intensive and often unreliable. Noninvasive genetic sampling and spatial capture-recapture models have improved monitoring, but integrating genetic and spatial data remains a challenge. We introduce WolfPackR, an R package designed to integrate genetic relatedness and spatial data for identifying wolf packs, lone individuals, and spatially isolated but genetically linked "ugly ducklings." WolfPackR uses pairwise relatedness estimators to define genetic groups and refines these groups through spatial overlap analysis based on Minimum Convex Polygons (MCPs). The package provides a comprehensive toolkit for analyzing population structure, territoriality, and social organization, including functions for genetic grouping, spatial clustering, summary statistics, and interactive visualization. We demonstrate the utility of WolfPackR using a case study of 505 genotyped and geospatialized wolf scat samples from Romania. By combining genetic and spatial data, WolfPackR accurately identifies pack structures that align with expert assessments and family tree reconstructions. The package's modular design and reliance on widely used R libraries (dplyr, igraph, sf, leaflet) ensure flexibility and ease of integration into existing workflows. While sampling heterogeneity may limit territory delineation in some cases, WolfPackR offers a cost-effective and reproducible framework for studying wolf pack dynamics, with potential applications for other social species.
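
The two ingredients named in the abstract, relatedness-based grouping and Minimum Convex Polygons, can be sketched in a few lines. This is an illustration in Python rather than the package's R interface. The function names `genetic_groups` and `mcp`, the relatedness threshold of 0.25, and the choice of connected components as the grouping rule are assumptions for the example, not WolfPackR's documented behaviour. The spatial overlap test between polygons is not shown.

```python
def genetic_groups(ids, relatedness, threshold=0.25):
    """Group individuals connected by pairwise relatedness >= threshold.

    relatedness maps (id_a, id_b) -> estimated relatedness.
    The threshold is a hypothetical value chosen for illustration.
    """
    parent = {i: i for i in ids}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), r in relatedness.items():
        if r >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    # Singleton groups correspond to candidate lone individuals.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


def mcp(points):
    """Minimum convex polygon (convex hull) via Andrew's monotone chain.

    Returns hull vertices in counter-clockwise order.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]
```

In a fuller workflow, one hull would be computed per genetic group from its sample locations, and groups would be refined according to how much their hulls overlap.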

16

eeeHive: a new HF RFID-based automated behavioral monitoring system for group-housed animals with high spatiotemporal resolution

Benner, S.; Shiono, S.; Kagawa, T.; Hattori, K.; Yamasue, H.; Lipp, H.-P.; Endo, T.

2026-05-05 animal behavior and cognition 10.64898/2026.04.30.720993 medRxiv

Top 0.1%

19.5%

Long-term, automated tracking of group-housed social animals using RFID (radio frequency identification) is a promising approach in ethological neuroscience. However, low-frequency (LF) RFID, while long-established in the field, is constrained by its inherent low data rates, which lead to two critical limitations: (1) compromised spatiotemporal resolution, and (2) the inability to identify multiple tags (animals) simultaneously. To address these limitations, we developed eeeHive, a high-frequency (HF) RFID-based animal tracking system with a fully custom hardware architecture that enables high-speed, multiplexed antenna polling and concurrent multi-tag reading. The polling time per antenna in eeeHive was 5.9 ms, with an additional 8.2 ms read time per tag. We applied the system to track 24 mice for one week, and six common marmosets for seven weeks. The system successfully tracked individuals even within dense clusters, revealing complex behavioral traits characterized by spatial utilization, temporal dynamics, behavioral regularity, and inter-individual relationships. Additional tests with Japanese fire-bellied newts and Nile tilapia juveniles demonstrated comparable tracking performance in aquatic environments. Taken together, eeeHive overcomes the inherent limitations of conventional LF RFID, establishing a powerful HF RFID-based platform for fine-scale behavioral tracking of group-housed animals across terrestrial and aquatic species.
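
The two timing figures in the abstract allow a back-of-the-envelope estimate of how often each antenna can be revisited. The sketch below assumes a simple additive model in which antennas are polled one after another and each tag read adds a fixed cost. That model, the function names, and the antenna count in the usage note are assumptions for illustration; the abstract does not state the number of antennas or the exact scheduling.

```python
POLL_MS_PER_ANTENNA = 5.9  # reported polling time per antenna
READ_MS_PER_TAG = 8.2      # reported additional read time per tag


def scan_cycle_ms(n_antennas, n_tags_read):
    """Estimated duration of one full scan cycle in milliseconds,
    assuming sequential polling and additive per-tag read cost."""
    return n_antennas * POLL_MS_PER_ANTENNA + n_tags_read * READ_MS_PER_TAG


def revisit_rate_hz(n_antennas, n_tags_read):
    """How many times per second each antenna is revisited under the
    same assumptions."""
    return 1000.0 / scan_cycle_ms(n_antennas, n_tags_read)
```

For a hypothetical array of 10 antennas with all 24 mice read once per cycle, `scan_cycle_ms(10, 24)` comes to 10 × 5.9 + 24 × 8.2 = 255.8 ms, or roughly four full cycles per second.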

17

An Automated Wireless Seesaw System Enabling Spatial Separation of Action and Reward in Group-Housed Marmosets

Cabrera-Moreno, J.; Burkart, J. M.; Bruegger, R. K.

2026-06-07 animal behavior and cognition 10.64898/2026.06.02.728149 medRxiv

Top 0.1%

19.1%

Cooperation in social species is shaped by ongoing social relationships, partner choice, and group interactions, demanding experimental systems that preserve the social context in which these behaviors unfold. Here we introduce the e-Seesaw, a wireless system for automated liquid reward delivery designed to support home-enclosure experiments on reward access distribution in common marmosets (Callithrix jacchus) with minimal human intervention. The apparatus combines a modular peristaltic pump with a Bluetooth-controlled trigger, allowing spatial separation between the site of action and the site of reward delivery while preserving group housing. We provide detailed design files, software, and assembly instructions to support reproduction and adaptation. In a proof-of-concept deployment across seven families, animals readily engaged with the device, producing a median of ~87 trigger activations per session. Engagement was concentrated early within sessions and remained largely stable across repeated deployments, including under increased action-reward separation. These results establish the e-Seesaw as a flexible and reproducible platform for automated reward-delivery experiments in animals tested within their social groups, while reducing human involvement and avoiding fixed dyadic testing.

18

LizardLens: A Two-Stage Deep Learning Pipeline for Detecting and Classifying Similar Species in Visually Complex Environments

Chia, W. H.; Jahanshahi, I.; Loh, L. Y.; Zheng, A.; Verma, N.; Mussman, S.; Shi, B.; Stroud, J. T.

2026-06-12 ecology 10.64898/2026.06.10.731342 medRxiv

Top 0.1%

18.7%

Community science platforms like iNaturalist generate unprecedented volumes of biodiversity data, but their scientific utility depends critically on accurate species identification--a persistent challenge when contributors often lack taxonomic expertise. We developed "LizardLens", a two-stage machine learning pipeline that decouples object detection from species classification to enable fine-grained identification of morphologically similar organisms in visually complex field photographs. Using 10,000 verified iNaturalist images of five Anolis lizard species in Florida, we trained specialized YOLO-based detection and Swin Transformer classification models and compared performance against state-of-the-art single-stage architectures. Our two-stage pipeline achieved 83.0% Top-1 accuracy and a macro-averaged F1-score of 89.0%, indicating strong precision-recall performance across species and outperforming single-stage YOLOv8 and YOLOv12 models across all evaluation metrics for all species, with relative improvements ranging from 10.5% to 13.2%. Gradient-weighted Class Activation Mapping (Grad-CAM) indicated that the model's predictions were consistently associated with regions corresponding to diagnostic morphological (e.g., head shape, feet, and limb lengths) and pattern features (e.g., ocular rings and body patterning), providing evidence that LizardLens leverages biologically relevant visual cues consistent with those used by expert taxonomists. Error analysis identified partial occlusion and multiple proximate individuals as primary sources of missed detections, while spurious detections of lizard-like environmental features (e.g., sticks, bark) represented the dominant false positive error mode. We deployed LizardLens as an accessible web application featuring interactive bounding box correction and ranked species predictions with confidence scores, directly supporting the "Lizards on the Loose" middle school community science initiative. 
By combining technical advances in fine-grained visual classification with user-centered design, LizardLens demonstrates how machine learning can simultaneously enhance data quality for biodiversity monitoring and provide authentic scientific experiences for student participants. Our approach is generalizable to other small-bodied organisms in complex habitats and provides a framework for translating computer vision advances into practical tools for community science and conservation.
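
The decoupling of detection from classification can be illustrated with a small control-flow sketch. It is not the LizardLens implementation: `detect` and `classify` are hypothetical callables standing in for the trained YOLO-based detector and Swin Transformer classifier, and the confidence cut-off and `top_k` value are arbitrary illustration choices. Images are nested lists so the example needs no imaging library.

```python
def crop(image, box):
    """Cut a (row0, col0, row1, col1) box out of a nested-list image.
    The end indices are exclusive."""
    r0, c0, r1, c1 = box
    return [row[c0:c1] for row in image[r0:r1]]


def two_stage_identify(image, detect, classify, min_det_conf=0.5, top_k=3):
    """Stage 1 locates candidate animals; stage 2 classifies each crop.

    detect(image)  -> iterable of (box, confidence)      [hypothetical]
    classify(crop) -> dict mapping species to probability [hypothetical]
    """
    results = []
    for box, det_conf in detect(image):
        # Drop weak detections before spending effort on classification.
        if det_conf < min_det_conf:
            continue
        scores = classify(crop(image, box))
        # Rank species by probability and keep the best top_k.
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        results.append({
            "box": box,
            "detection_confidence": det_conf,
            "ranked_species": ranked[:top_k],
        })
    return results
```

Because the classifier only ever sees the cropped region, it can specialise in fine-grained differences between similar species without also having to learn where the animal is in a cluttered scene.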

19

ctSpyderFields: A Python package for visual field reconstruction in spiders

De Agro, M.; Caradonna, D.; Pande, A.; Falotico, E.; Sumner-Rooney, L.

2026-05-29 bioinformatics 10.64898/2026.05.28.728173 medRxiv

Top 0.1%

18.6%

The measurement of visual fields in arachnology has a long-standing history. Given the wide variety of eye positions, orientation and structure, the topic is fundamental for studies of taxonomy, evolution, ecology and behavior. The existing methods for measuring visual fields deploy ophthalmoscopic measurements, which require custom microscopes, anatomical structures like the reflective tapetum, which may not always be present, or the capacity to detect photoreceptor autofluorescence. Here we present the ctSpyderFields Python package: a tool for geometrically predicting the visual fields of arachnids from digital images of the lens and retina. The tool uses images coming from computed tomography (CT) scans of specimens, but could be applied to other 3D microscopy techniques, to virtually project the boundaries of the retina through the geometrically predicted nodal point of the lens, deriving a rough per-eye visual field both in Cartesian and spherical coordinates. The extracted data can then be used to calculate likely visual field overlap between eyes and angular spans, which can be compared within or between species. We also provide a use case, reporting the visual field data extracted from a museum specimen of Philaeus chrysops. We propose that the tool will allow a wider comparative analysis of visual fields across spider species, unlocking the potential for a deeper understanding of visual ecology and evolution.
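
The projection described above, from retina boundary points through the nodal point of the lens, reduces to simple vector geometry. The sketch below is not the ctSpyderFields API: the function name `project_visual_field` and the angle convention (azimuth measured in the x-y plane from the +x axis, elevation measured from that plane towards +z) are assumptions made for illustration.

```python
import math


def project_visual_field(retina_points, nodal_point):
    """For each retina boundary point, return the outward viewing
    direction through the nodal point as a Cartesian unit vector and
    as (azimuth, elevation) in degrees.

    The angle convention is an assumption, not the package's own.
    """
    projected = []
    for point in retina_points:
        # Ray from the retina point through the nodal point.
        d = [n - p for n, p in zip(nodal_point, point)]
        norm = math.sqrt(sum(v * v for v in d))
        if norm == 0:
            raise ValueError("retina point coincides with nodal point")
        x, y, z = (v / norm for v in d)
        azimuth = math.degrees(math.atan2(y, x))
        # Clamp guards against rounding slightly outside [-1, 1].
        elevation = math.degrees(math.asin(max(-1.0, min(1.0, z))))
        projected.append(((x, y, z), (azimuth, elevation)))
    return projected
```

The set of directions obtained for all boundary points of one retina outlines that eye's visual field on the unit sphere, which is the form needed to compare angular spans or estimate overlap between eyes.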

20

Standardizing image-derived fish length-frequency distributions to reference measurements using bin-specific error matrices

Shibata, Y.; Iwahara, Y.; Hino, H.; Tsukada, A.; Kisara, Y.; Nishino, T.; Endo, H.

2026-07-06 ecology 10.64898/2026.07.06.736664 medRxiv

Top 0.1%

18.6%

Artificial intelligence (AI)-based image analysis can efficiently estimate fish length, but differences in devices, imaging conditions, operators, and AI models limit comparability among surveys. We propose a standardization framework that estimates a bin-specific error matrix from paired reference measurements and AI-derived lengths and applies it to standardize (correct) AI-derived length-frequency distributions. The Richardson-Lucy expectation-maximization algorithm was used, with the number of iterations selected via cross-validation. Simulations based on empirical length-frequency data from 110 species showed that standardization reduced relative bias and distributional discrepancy; median relative-bias and root mean square error ratios were below 1, and the performance was more affected by the amount of paired data than by the number of cross-validation folds. In real data from 957 Japanese jack mackerel, standardized AI-derived distributions approached human-observer histograms, although discrepancies remained in the range of 160-230 mm. The proposed framework provides a practical approach for improving the comparability of image-derived length-frequency data using paired calibration data, without retraining the underlying AI model.
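
The Richardson-Lucy update applied to a length-frequency histogram can be written compactly. The sketch below is a generic version of the algorithm, not the authors' code: the function name, the uniform starting estimate, and the fixed iteration count are assumptions. The abstract states that the number of iterations was selected by cross-validation, which is not reproduced here. The error matrix is assumed to hold P(AI-derived bin i | reference bin j), with each column summing to 1.

```python
def richardson_lucy(observed, error_matrix, n_iter=50):
    """Estimate reference-scale bin counts from AI-derived bin counts.

    observed[i]        : count in AI-derived length bin i
    error_matrix[i][j] : P(AI-derived bin i | reference bin j),
                         columns assumed to sum to 1
    n_iter             : fixed here; an assumption for the sketch
    """
    n_obs = len(observed)
    n_true = len(error_matrix[0])
    total = float(sum(observed))
    # Start from a flat estimate carrying the same total count.
    estimate = [total / n_true] * n_true
    for _ in range(n_iter):
        # Expected AI-derived histogram under the current estimate.
        predicted = [
            sum(error_matrix[i][j] * estimate[j] for j in range(n_true))
            for i in range(n_obs)
        ]
        ratio = [
            observed[i] / predicted[i] if predicted[i] > 0 else 0.0
            for i in range(n_obs)
        ]
        # Multiplicative update keeps every bin non-negative.
        estimate = [
            estimate[j] * sum(error_matrix[i][j] * ratio[i] for i in range(n_obs))
            for j in range(n_true)
        ]
    return estimate
```

Stopping early acts as regularisation: with noisy counts, running many iterations tends to amplify noise, which is one reason the iteration count is worth tuning rather than simply maximising.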